Conversion of categorical variables into numerical variables via Bayesian network classifiers for binary classifications
نویسندگان
چکیده
Many pattern classification algorithms such as Support Vector Machines (SVMs), MultiLayer Perceptrons (MLPs), and K-Nearest Neighbors (KNNs) require data to consist of purely numerical variables. However many real world data consist of both categorical and numerical variables. In this paper we suggest an effective method of converting the mixed data of categorical and numerical variables into data of purely numerical variables for binary classifications. Since the suggested method is based on the theory of learning Bayesian Network Classifiers (BNCs), it is computationally efficient and robust to noises and data losses. Also the suggestedmethod is expected to extract sufficient information for estimating a minimum-error-rate (MER) classifier. Simulations on artificial data sets and real world data sets are conducted to demonstrate the competitiveness of the suggested methodwhen the number of values in each categorical variable is large andBNCs accurately model the data. © 2009 Elsevier B.V. All rights reserved.
منابع مشابه
Using Bayesian networks with rule extraction to infer the risk of weed infestation in a corn-crop
This paper describes the modeling of a weed infestation risk inference system that implements a collaborative inference scheme based on rules extracted from two Bayesian network classifiers. The first Bayesian classifier infers a categorical variable value for the weed–crop competitiveness using as input categorical variables for the total density of weeds and corresponding proportions of narro...
متن کاملA New Nonlinear Specification of Structural Breaks for Money Demand in Iran
In a structural time series regression model, binary variables have been used to quantify qualitative or categorical quantitative events such as politic and economic structural breaks, regions, age groups and etc. The use of the binary dummy variables is not reasonable because the effect of an event decreases (increases) gradually over time not at once. The simple and basic idea in this paper i...
متن کاملBayesian Chain Classifiers for Multidimensional Classification
In multidimensional classification the goal is to assign an instance to a set of different classes. This task is normally addressed either by defining a compound class variable with all the possible combinations of classes (label power-set methods, LPMs) or by building independent classifiers for each class (binary-relevance methods, BRMs). However, LPMs do not scale well and BRMs ignore the de...
متن کاملRisk Analysis of Operating Room Using the Fuzzy Bayesian Network Model
To enhance Patient’s safety, we need effective methods for risk management. This work aims to propose an integrated approach to risk management for a hospital system. To improve patient’s safety, we should develop flexible methods where different aspects of risk and type of information are taken into consideration. This paper proposes a fuzzy Bayesian network to model and analyze risk in the op...
متن کاملLearning Bayesian Network Structure using Markov Blanket in K2 Algorithm
A Bayesian network is a graphical model that represents a set of random variables and their causal relationship via a Directed Acyclic Graph (DAG). There are basically two methods used for learning Bayesian network: parameter-learning and structure-learning. One of the most effective structure-learning methods is K2 algorithm. Because the performance of the K2 algorithm depends on node...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Computational Statistics & Data Analysis
دوره 54 شماره
صفحات -
تاریخ انتشار 2010